Moving EL Bubbles with MPI Decomposition #1290
wilfonba wants to merge 44 commits into MFlowCode:master from …
Conversation
…and MPI communication

Features:
- Lagrangian bubble movement with projection-based void fraction smearing
- Kahan summation for accurate void fraction boundary conditions
- Extended MPI communication for moving EL bubbles
- New 2D and 3D moving Lagrangian bubble examples
- Updated test cases and golden files
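The Kahan summation called out above is a standard compensated-summation technique; here is a minimal Python sketch of the idea (the PR's actual implementation is in Fortran, and this code is illustrative only):

```python
def kahan_sum(values):
    """Compensated (Kahan) summation: tracks the low-order bits lost
    when adding a small term to a large running total."""
    total = 0.0
    comp = 0.0  # running compensation for lost low-order bits
    for v in values:
        y = v - comp             # re-inject previously lost bits
        t = total + y            # big + small: low-order bits of y are lost
        comp = (t - total) - y   # recover what was just lost
        total = t
    return total
```

The compensation term matters when many small contributions (here, smeared void-fraction increments near a boundary) are accumulated into a much larger running value.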
…lfong into MovingBubblesFresh-clean
Keep PR's lag_params additions (pressure_force, gravity_force, write_void_evol) with ruff formatting. Keep PR's golden files for Lagrange Bubbles tests (may need regeneration). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Force-pushed 0604652 to a4e5fb0
Force-pushed a4e5fb0 to c0f6323
- Fix L(1) denominator in 4th-order Lagrange interpolation: xi(2) -> xi(1)
- Fix L(4) numerator: exclude xi(4), not xi(5), from the product
- Add adap_dt_stop to private clause in m_bubbles_EE GPU loop (race fix)
- Add #else branch to s_mpi_reduce_int_sum for non-MPI builds
- Initialize bubs_glb = 0 before conditional MPI reduce
- Remove duplicate hyper_cleaning in pre_process namelist
- Remove stray sum.png from examples
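The two interpolation bugs above (a wrong node in a denominator, and the own node left inside a numerator product) are exactly the mistakes a generic Lagrange basis loop avoids. A Python sketch of a 5-point (4th-order) basis, illustrative rather than the PR's Fortran kernel:

```python
def lagrange_basis(xi, pos):
    """Lagrange basis weights L_k(pos) for nodes xi:
    L_k = product over j != k of (pos - xi[j]) / (xi[k] - xi[j]).
    Hand-unrolled versions risk exactly the bugs fixed here: a wrong
    node in a denominator, or the own node (j == k) left in a numerator."""
    n = len(xi)
    L = []
    for k in range(n):
        w = 1.0
        for j in range(n):
            if j != k:  # own node must be excluded from the product
                w *= (pos - xi[j]) / (xi[k] - xi[j])
        L.append(w)
    return L
```

At a node, the correct basis returns 1 for that node's weight and 0 for the rest, and the weights sum to 1 everywhere; both properties break under either bug.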
- Remove case default in f_advance_step that dereferenced absent optional args fVel/fPos (undefined behavior when called from EE path with adap_dt)
- Add @:DEALLOCATE for p_send_buff, p_recv_buff, p_send_ids in MPI finalize
- Add intent(in) on lag_num_ts in s_initialize_particles_mpi
- Add intent(in) on nBub, pos, posPrev in s_add_particles_to_transfer_list
- Fix bare 3 -> 3._wp in vel_model=3 force computation
p_send_buff/p_recv_buff/p_send_ids are only allocated when num_procs > 1 (in s_initialize_particles_mpi). Use an allocated() guard to avoid crashing on single-rank MPI runs.
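The pattern behind this fix — guard cleanup on the actual allocation state, not on a feature flag — can be sketched in Python. Field names mirror the Fortran ones, but the class is illustrative, not the PR's implementation:

```python
class ParticleExchange:
    """Sketch: buffers exist only for multi-rank runs, so finalize must
    check the buffer itself (allocated() in Fortran), not a feature flag
    like bubbles_lagrange that is also true on single-rank runs."""

    def __init__(self, num_procs):
        # mirrors s_initialize_particles_mpi: allocate only if num_procs > 1
        self.p_send_buff = [0.0] * 16 if num_procs > 1 else None

    def finalize(self):
        # guard on actual allocation state, not the feature flag
        if self.p_send_buff is not None:
            self.p_send_buff = None  # stands in for @:DEALLOCATE
```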
Claude Code Review: incremental review. Two files changed since the last review. No new high-confidence issues.
Claude Code Review: incremental review from 394e7c3. New findings since last Claude review:
…ent beta_vars

- Add missing END_GPU_PARALLEL_LOOP in s_enforce_EL_bubbles_boundary_conditions
- Fix swapped vapor/gas properties in 2D_moving_lag_bubs and 3D_moving_lag_particles examples (gam_v/gam_g, M_v/M_g, k_v/k_g, cp_v/cp_g, R_v/R_g were all inverted vs reference examples)
- Document beta_vars indices: 1 = void fraction, 2 = d(beta)/dt, 5 = energy
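The sparse beta_vars indices documented here (1, 2, 5) are easy to misuse as bare literals. A Python sketch of the named-constant alternative — the enum names are hypothetical; only the index values come from the commit:

```python
from enum import IntEnum

class BetaVar(IntEnum):
    """Illustrative named indices for the beta_vars slots documented in
    m_constants.fpp (names are placeholders; values are from the PR)."""
    VOID_FRACTION = 1   # beta itself
    DBETA_DT = 2        # time derivative of beta
    ENERGY_SOURCE = 5   # energy source term

# Indexing a 1-based Fortran-style array by name instead of a magic number:
beta_vars = {i: 0.0 for i in range(1, 6)}
beta_vars[BetaVar.VOID_FRACTION] = 0.25
```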
- Fix out-of-bounds write in s_write_restart_lag_bubbles: the write loop iterated over n_el_bubs_loc, but the buffer was sized for bub_id (the filtered in-domain count). Add a particle_in_domain_physical filter to the write loop to match the count loop and the allocation.
- Initialize write_void_evol, pressure_force, gravity_force in pre_process m_global_parameters (they were uninitialized and could contain garbage for case files that omit these parameters).
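The out-of-bounds bug is a classic two-pass-pattern failure: the count pass and the write pass applied different predicates. A Python sketch of the corrected pattern (names echo the Fortran ones; the logic is illustrative):

```python
def write_restart(particles, in_domain):
    """Two-pass pattern: the write pass must apply the SAME filter as the
    count pass, or the buffer is overrun (the bug fixed here: buffer sized
    for the filtered count, but the loop ran over the total count)."""
    # Pass 1: count particles passing the filter (-> buffer size)
    bub_id = sum(1 for p in particles if in_domain(p))
    buf = [None] * bub_id
    # Pass 2: write with the same filter and a separate counter
    i = 0
    for p in particles:
        if in_domain(p):  # without this guard, i can exceed bub_id
            buf[i] = p
            i += 1
    assert i == bub_id  # counts must agree by construction
    return buf
```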
Force-pushed 196d171 to d030b8a
The 2D_moving_lag_bubs and 3D_moving_lag_particles examples use adaptive time stepping with parameters tuned for their full grid (200x200, 200x200x200). When the test framework reduces the grid to 25x25, the CFL condition forces thousands of internal sub-steps per outer step, causing the simulation to hang. These examples need dedicated non-reduced test cases with appropriate parameters.
Required by lint_param_docs.py (merged in MFlowCode#1327): interface_file, normFac, normMag, g0_ic, p0_ic are used by the hcid 304/305 hardcoded ICs.
Description
This PR adds support for moving Euler-Lagrange bubbles and several force models. Tracer particles that simply follow the background flow are implemented, as are Newton's-second-law particles that respond to body forces, pressure gradients, and drag. This PR is marked as a draft for now, as there are several things to complete before a proper merge, but I wanted these changes to be visible to other contributors working on related features.
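As a rough illustration of the two particle models described — not the PR's Fortran implementation — here is a Python sketch of the force balance. A linear Stokes-type drag with relaxation time tau_d is an assumption here and may differ from the PR's drag_model options:

```python
def particle_acceleration(v_p, v_f, grad_p, rho_p, g, tau_d):
    """Sketch of a Newton's-second-law particle with the three force
    classes the PR describes: gravity (body force), pressure gradient,
    and drag. Linear drag with relaxation time tau_d is an assumption.
    v_p, v_f, grad_p, g are 3-vectors (lists); rho_p is a density."""
    return [
        g[i]                          # body force per unit mass
        - grad_p[i] / rho_p           # pressure-gradient force per unit mass
        + (v_f[i] - v_p[i]) / tau_d   # linear drag toward the fluid velocity
        for i in range(3)
    ]

def tracer_velocity(v_f):
    """Tracer model: the particle simply takes the local fluid velocity
    (no force balance at all)."""
    return list(v_f)
```

A tracer needs no integration of forces, which is why the two models can share the same advection machinery while differing only in how the particle velocity is obtained.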
Type of change
Testing
16-rank CPU versus 16-rank GPU

This test was run using the examples/3D_lagrange_bubblescreen test case and compares the results of a 16-rank CPU simulation to a 16-rank GPU simulation. The visualization shows the following: test.mp4
Checklist
See the developer guide for full coding standards.
GPU changes (expand if you modified src/simulation/)

Remaining items before merge
Must do
- vel_model > 0 paths (tracer and Newton's 2nd Law) in toolchain/mfc/test/cases.py: the 2D_moving_lag_bubs and 3D_moving_lag_particles examples use adap_dt with parameters tuned for their full grid size and hang when reduced to 25x25 test grids. Dedicated test cases with appropriate parameters are needed.
- docs/documentation/case.md for new parameters (vel_model, drag_model, pressure_force, gravity_force, write_void_evol, input_path, charNz)

Should verify
- n_el_bubs_glb is rank-0 only after MPI_REDUCE — confirmed intentional (only rank 0 prints stats)
- nReal = 7 + 16*2 + 10*lag_num_ts magic number in s_initialize_particles_mpi — this buffer size must stay in sync with the pack/unpack loops. Consider computing it from named constants or adding a comment mapping each count to its field

Already fixed (by review commits)
Correctness fixes:
- L(1) denominator xi(2) -> xi(1), L(4) numerator xi(4) -> xi(5) (m_bubbles_EL_kernels.fpp)
- case default in f_advance_step removed — dereferenced absent optional fVel/fPos args when called from EE path with adap_dt (m_bubbles.fpp)
- particle_in_domain_physical filter added to the restart write loop to match allocation size (m_bubbles_EL.fpp)
- bub_pp thermodynamic params were inverted (2D_moving_lag_bubs, 3D_moving_lag_particles)
- Missing $:END_GPU_PARALLEL_LOOP() added in s_enforce_EL_bubbles_boundary_conditions (m_bubbles_EL.fpp)

GPU fixes:
- adap_dt_stop added to GPU private clause in m_bubbles_EE.fpp (race condition)

MPI fixes:
- s_mpi_reduce_int_sum #else branch for non-MPI builds (m_mpi_common.fpp)
- bubs_glb initialized to 0 before conditional MPI reduce (m_mpi_common.fpp)
- @:DEALLOCATE for p_send_buff, p_recv_buff, p_send_ids with allocated() guard (m_mpi_proxy.fpp)
- intent(in) added on lag_num_ts, nBub, pos, posPrev in MPI subroutines (m_mpi_proxy.fpp)

Code quality fixes:
- hyper_cleaning removed from pre_process namelist (m_start_up.fpp)
- 3 -> 3._wp in vel_model=3 force computation (m_bubbles.fpp)
- beta_vars indices documented: 1 = void fraction, 2 = d(beta)/dt, 5 = energy source (m_constants.fpp)
- write_void_evol, pressure_force, gravity_force initialized in pre_process (m_global_parameters.fpp)
- sum.png removed from examples
- path variable shadowing fixed in cases.py

Test fixes:
- 2D_moving_lag_bubs and 3D_moving_lag_particles added to casesToSkip — these examples use adap_dt with parameters tuned for 200x200 grids. When reduced to 25x25, the CFL condition forces thousands of internal sub-steps, causing hangs. Stale golden files removed.

Review commit changelog (sbryngelson/Claude Code)
The following commits were added by @sbryngelson via Claude Code review to fix bugs and bring the PR to CI-passing state. All changes are to wilfonba's code; no new features were added.
52adde37 — Fix review bugs: Lagrange interpolation, GPU race, MPI, namelist
- (m_bubbles_EL_kernels.fpp:664): (xi(2) - xi(5)) corrected to (xi(1) - xi(5))
- (pos - xi(4)) corrected to (pos - xi(5)) — must exclude own index
- adap_dt_stop added to private clause in the m_bubbles_EE.fpp GPU parallel loop
- #else; sum = var_loc added to s_mpi_reduce_int_sum in m_mpi_common.fpp
- bubs_glb = 0 added before conditional MPI reduce in both MPI and non-MPI paths
- Duplicate hyper_cleaning removed from pre_process m_start_up.fpp
- Stray examples/3D_moving_lag_particles/sum.png removed

506dccd2 — Fix optional arg UB, missing deallocates, missing intent
- case default in f_advance_step (m_bubbles.fpp) removed — it wrote to absent optional args fVel/fPos when called from the EE path with adap_dt=T
- @:DEALLOCATE for p_send_buff, p_recv_buff, p_send_ids in s_finalize_mpi_proxy_module
- intent(in) on lag_num_ts in s_initialize_particles_mpi and on nBub, pos, posPrev in s_add_particles_to_transfer_list
- 3 -> 3._wp in vel_model=3 force computation

f6adae3a — Fix dealloc crash on single-rank MPI
- if (bubbles_lagrange) changed to if (allocated(p_send_buff)) — the buffers are only allocated when num_procs > 1 (inside s_initialize_particles_mpi), so single-rank MPI runs crashed with "Attempt to DEALLOCATE unallocated"

fa2bc4e4 — Fix GPU directive, example physics, documentation
- $:END_GPU_PARALLEL_LOOP() added after the locate-cell loop in s_enforce_EL_bubbles_boundary_conditions
- 2D_moving_lag_bubs/case.py and 3D_moving_lag_particles/case.py — all 10 bub_pp thermodynamic parameters (gam_v/gam_g, M_v/M_g, k_v/k_g, cp_v/cp_g, R_v/R_g) were inverted vs the correct convention in 3D_lagrange_shbubcollapse
- beta_vars indices documented in m_constants.fpp

99dbd4d0 — Fix restart write OOB and uninitialized lag_params
- s_write_restart_lag_bubbles allocated its buffer with the filtered count (bub_id), but the write loop iterated over the total count (n_el_bubs_loc). Added a particle_in_domain_physical filter to the write loop using a separate counter i
- write_void_evol = .false., pressure_force = .false., gravity_force = .false. added to the pre_process m_global_parameters.fpp default initialization

d030b8a4 — Remove pylint directives
- # pylint: disable comments removed from cases.py (replaced by ruff per upstream "lint: add pylint directive check and remove dead directives" #1323)

c0f63238 — Formatting and lint cleanup after rebase
- path variable shadowing lint error in cases.py (renamed to case_path)
- === separator comment that the new source linter flags

35a6d5bd — Skip hanging example tests
- Added 2D_moving_lag_bubs and 3D_moving_lag_particles to casesToSkip in cases.py
- adap_dt=T with dt=1 on domain [-2000, 2000]: when the test framework reduces the grid from 200x200 to 25x25, cell size increases from 20 to 160, but the sound speed (~1500 m/s) stays the same. The adaptive stepper gets stuck in thousands of sub-steps trying to satisfy CFL, causing an infinite hang.

AI code reviews
Reviews are not triggered automatically. To request a review, comment on the PR:
- @coderabbitai review — incremental review (new changes only)
- @coderabbitai full review — full review from scratch
- /review — Qodo review
- /improve — Qodo code suggestions